PARALLEL COMPUTATION IS ESS 

NABARUN MONDAL AND PARTHA P. GHOSH 

Abstract. There is an enormous number of examples of computation in nature,
exemplified across multiple species in biology. One crucial aim of these
computations across all life forms is the ability to learn and thereby increase the
chance of survival. In the current paper a formal definition of autonomous
learning is proposed. From that definition we establish a general purpose model
for learning. Different implementations of the model are discussed. It is found
that for general purpose learning, the models capable of parallel execution would
be evolutionarily stable, and hence in nature parallelism in computation is found
in abundance.

1. Introduction

Computations abound in nature, and it is not hard to fathom that one of
the purposes of natural computation is to learn. A more learned individual
would thereby gain an advantage over the others [17], and survive. But then, how
do the individuals of a species learn? What is the mechanism behind their learning?
The extended Church Turing Thesis [14][15][2][3][5], normally stated in the form
"any computation happening in nature must also have a Turing Machine analog",
suggests that no computation in nature can exceed the power of a Turing Machine
[6]. This intriguing thought of simulating nature on Turing Machines provoked
the study of machine learning and artificial intelligence [7][5].
One school of thought about machine learning follows the sequential way of
creating machines which are capable of learning. In fact, scholars from the sequential
school argue [14][15] that parallel systems won't have any added advantage over
the sequential ones, because both are reducible to Turing Machines.

Another school of thought [1] questions the sequential learning strategies, as
nature seems to be inherently running in parallel. The massively parallel structure
of neurones in the brain [13] prompted the artificial neural network studies [9][10],
and it is now known that bacterial colonies learn by pooling the resources of the
individual bacteria [12], which makes the learning strategy parallel by definition.
Scholars from the parallel school ask the question:-

Given that Nature is inherently parallel, how useful is the sequential way of
learning going to be?

It is well known that computationally both of these models actually yield the
same power [4][8]: every parallel system must have a sequential analog. The
sequential models would take more time, less computing power, and less space,
while the parallel ones would take less time, more computing power and more
space. Clearly then, the explanation of Nature's parallelism lies not in the computing
power of the abstract Turing Machine, but has to do with how each model has
evolved in nature with no supervision (blind design) [16].

Can one then decisively argue why parallel computation prevails in Nature?
This question begs an answer because parallelism in execution
is harder to attain in the artificial settings of man made computing machines.
Artificial parallel systems are faster, but too simple when compared against
natural systems like the human brain, which are slower but far more complex [14].

In the present paper we take up this exact problem. We notice that the real
problem about systems evolving in nature is:-

Given the limited economy of operations possible in nature, what mode of
computation would evolutionarily arise and become dominant?

We discuss the abstract learning procedure in Section 2. We establish that
there is an abstract autonomous (blind, directionless) system capable of learning,
which borrows two operations from biology: cloning and mutation. Section 3
discusses the sequential and parallel designs (Dawkins' designoid, as any evolved, blind
design is not a design in a true sense [17]). In Section 4 we compare these
two models from the standpoint of optimality of resources. We find that for general
purpose learning, parallelism has an evolutionary advantage. It is not only because
it is faster, but also because, using the operations clone and mutate, no other
computationally better organization is possible, and computationally it can not be
improved. Finally, in Section 5 we put all these results together and show that
once the parallel strategy (in the sense of game theory) has invaded the population, it
would become dominant, and stay dominant, because it is an evolutionarily stable
strategy. Therefore, we conclude that the theory of computation and game theory
together can explain the prevalence of parallelism in computational circuitry in
nature.

2. Learning 

Everyone has an intuitive idea about what learning is. But that idea relies upon
another intuitive idea of what knowledge is. For example: learning is knowing
what one did not know earlier.
In this sense, there are some properties of learning one can summarise:-

Definition 2.1. Informal Description of Learning.

After a system has learned, at some time $t_1$ the following properties hold:-

(1) The system can achieve something which was unachievable at any time $t < t_1$.

(2) The system continues to achieve everything it could achieve before $t_1$.

(3) The system is autonomous, devoid of supervision by any intelligent
agent.

The third point needs elaboration. Once the system is set into motion, no
further tweaking should be done to the system. Specifically, getting out of
the system to perform system optimization is not allowed at all for autonomous
learning. This idea of jumping out of the system is elaborated heavily in [11].

In a formal sense then, if there is a set of knowledge associated with the system
which can somehow be identified as the set $\mathcal{K}(t_1)$ at time $t_1$, and at a later time
$t_2$ as the set $\mathcal{K}(t_2)$, then the statement "learning took place between $t_1$ and $t_2$" is
equivalent to the formal statement:-

$$\mathcal{K}(t_1) \subset \mathcal{K}(t_2)$$

The learned knowledge can be modelled by:-

$$\mathcal{L}(t_1, t_2) = \mathcal{K}(t_2) \setminus \mathcal{K}(t_1)$$

But these semi-formal definitions would not really formalize learning, given that
the set $\mathcal{K}(t)$ was not formally defined. Can we formally quantify the set $\mathcal{K}(t)$?

At this point, we argue that the existence of $\mathcal{K}(t)$ can not be measured without
the effect of $\mathcal{K}(t)$ exhibited by the system, which can only be identified by
experimentation. If the Extended Church Turing thesis [14][15][4] is true, that
is, any model of computation has an analogous Turing Machine model, then the
effect of $\mathcal{K}(t)$ can be found in a pretty straightforward manner. We need
some definitions to reach that formal definition of learning.

Definition 2.2. System. 

Let a system $S$ be defined by an underlying Universal Turing Machine $\mathcal{TM}$ [4]
with a rule table $R(S)$, which accepts the language class $\mathcal{L}(S)$ by simulating the
rule table $R(S)$.

Definition 2.3. Formal Definition of Learning.

Let a system (definition 2.2) at time $t$ accept the language class $\mathcal{L}(S(t))$. The
system has learned within the time interval $t_1 \to t_2$, $t_1 < t_2$, iff:-

$$\mathcal{L}(S(t_1)) \subset \mathcal{L}(S(t_2))$$

The knowledge of the system is $\mathcal{K}(S) = \mathcal{L}(S)$, and learning is precisely defined as:-

$$\mathcal{L}(t_1, t_2) = \mathcal{L}(S(t_2)) \setminus \mathcal{L}(S(t_1))$$

Comparing definition (2.1) with (2.3), we can see that (2.3) is just a special case
of (2.1), which matches the intuition. Knowledge accumulation is now
experimentally verifiable. To show that a system has learned between $(t_1, t_2)$, one has
to find a string $w_2 \notin \mathcal{L}(S(t_1))$ with $w_2 \in \mathcal{L}(S(t_2))$. We also note that the strict
subset inclusion $\mathcal{L}(S(t_i)) \subset \mathcal{L}(S(t_j))$ for $i < j$ makes learning a filtration [19][20]
process.
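The experimental check described above can be illustrated with a minimal Python sketch. It assumes a finite test universe of strings and models each snapshot of the system as a plain predicate; the names `learning_witness`, `s_t1` and `s_t2` are hypothetical and not part of the paper's formalism.

```python
from typing import Callable, Iterable, Optional

# Hypothetical stand-in for "the system at time t accepts the string w".
Acceptor = Callable[[str], bool]


def learning_witness(old: Acceptor, new: Acceptor,
                     universe: Iterable[str]) -> Optional[str]:
    """Return a string accepted at t2 but not at t1, or None.

    None is also returned when a previously accepted string is lost,
    because then L(S(t1)) is not a subset of L(S(t2)).
    """
    witness = None
    for w in universe:
        if old(w) and not new(w):
            return None              # knowledge was lost: not learning
        if new(w) and not old(w) and witness is None:
            witness = w              # first newly accepted string
    return witness


# Toy snapshots over the alphabet {a, b} (illustrative only).
universe = ["a", "aa", "ab", "b", "ba"]
s_t1 = lambda w: w.startswith("a") and "b" not in w   # accepts a, aa
s_t2 = lambda w: w.startswith("a")                    # accepts a, aa, ab

print(learning_witness(s_t1, s_t2, universe))
```

On a finite universe the check can only confirm learning relative to the strings tested; it does not decide inclusion of the full language classes.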

We note that the language class $\mathcal{L}(S)$ accepted by a system $S$ relies upon the rule
table $R(S)$ used by the underlying Universal Turing Machine, and nothing
else. Hence, the learning process must somehow alter the rule table.

But what is the natural mechanism? How does an altered rule table ensure that the earlier
accepted language class is still accepted after the modifications
of the rule table? Also, what kind of alterations to the rule table are permitted?
From a biological point of view, a mechanism seems feasible.

In Biology, the change (improvement) of a life form takes place due to a change
in the biological rule book, that is, the genome [16]. It has been argued [16][17]
that the rule book is the only thing that matters, and that it identifies the life form.
The change in the life form, no matter how complex it looks, propagates through
small, random changes in the genome. These changes have very small but non zero
probability [17], and can be modelled as a stochastic process [19][20]. Two different
life forms having the same genetic code are biologically indistinguishable. Interestingly,
the same argument holds true here. For a system, the rule table is the only thing
that matters: there is no way to distinguish the knowledge of two different systems, given
that they have the same copy of the rule table.

The process of copying the rule book is known as cloning, while any modification 
is known as mutation which generates a different rule book. We now define these 
concepts in a more formal way. 

Definition 2.4. Clone System. 

A system $S_c$ is said to be a clone of another system $S$ iff the rule tables of both
systems are identical. Formally:-

$$R(S_c) = R(S)$$

If a system $S$ is capable of making a clone $S_c$ of itself, then it can ensure that
$\mathcal{L}(S) = \mathcal{L}(S_c)$, that is, both systems would accept the same language class, thereby
exhibiting the same knowledge, as discussed before.

Now, we need to discuss how to alter the rule table information. To do
that we borrow a groundbreaking idea from Gödel.

Definition 2.5. Gödel Encoding (Gödelization).

Let the alphabet be $\Sigma = \{s_0, s_1, s_2, \dots, s_{b-1}\}$ with $|\Sigma| = b$. Let $g : \Sigma \to \{0, 1, 2, 3, \dots, b\}$
be defined as:-

$$g(x) = i \text{ when } x = s_i$$

Then, the Gödelization of the string $w = x_0 x_1 \dots x_k$ is defined as:-

$$G(w) = \sum_{r=0}^{k} g(x_r)\, b^r$$

Definition 2.6. Gödel Decoding (Reverse Gödelization).

Let the alphabet be $\Sigma = \{s_0, s_1, s_2, \dots, s_{b-1}\}$ with $|\Sigma| = b$. The reverse Gödelization
of $x$ is defined as the inverse function of Gödelization, i.e. $\rho_G : \mathbb{Z}_+ \to \Sigma^+$:-

$$\rho_G(x) = w \text{ iff } G(w) = x$$
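As an illustration of Definitions 2.5 and 2.6, the following Python sketch encodes a string as a base-$b$ number with the first symbol as the least significant digit, and decodes it back. The function names and the choice of passing the alphabet as a Python string are assumptions made for this example. One caveat of the definition as stated: since $g(s_0) = 0$, trailing $s_0$ symbols contribute nothing to $G(w)$, so the decoder below returns the shortest string with the given code.

```python
def godel_encode(w: str, alphabet: str) -> int:
    """G(w) = sum over r of g(x_r) * b**r, with g(s_i) = i and b = |alphabet|."""
    b = len(alphabet)
    return sum(alphabet.index(x) * b**r for r, x in enumerate(w))


def godel_decode(x: int, alphabet: str) -> str:
    """Shortest string w with G(w) = x (digits read least significant first)."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    b = len(alphabet)
    symbols = []
    while x > 0:
        x, digit = divmod(x, b)
        symbols.append(alphabet[digit])
    return "".join(symbols)


code = godel_encode("bc", "abc")        # 1 * 3**0 + 2 * 3**1
print(code, godel_decode(code, "abc"))
```

With the alphabet "abc", the strings "bc" and "bca" share the same code, which is why the round trip is only exact for strings that do not end in the first alphabet symbol.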

To learn, we need a mechanism to change the rule table. In biology, such a 
mechanism exists, and is known as mutation. We use the same keyword, and 
define mutation formally. 

Definition 2.7. Mutation. Let $w$ be a string over the alphabet $\Sigma$. Let $r_c(x)$ be a
class of computable functions $r_c : \mathbb{Z}_+ \to \mathbb{Z}_+$ such that:-

$$\forall x \in \mathbb{Z}_+ ;\ r_c(x) \neq x$$

Then, the function $\mu : \Sigma^+ \to \Sigma^+$

$$\mu(w) = \rho_G(r_c(G(w)))$$

is called a mutation of the string $w$.
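A mutation in the sense of Definition 2.7 can be sketched in Python as the composition decode, perturb, encode. The perturbation `r_c = lambda x: x + 1` used below is only a hypothetical example of a function with no fixed point; the helper names are likewise assumptions of this sketch.

```python
from typing import Callable


def encode(w: str, alphabet: str) -> int:
    """Goedel number of w: first symbol is the least significant digit."""
    b = len(alphabet)
    return sum(alphabet.index(x) * b**r for r, x in enumerate(w))


def decode(x: int, alphabet: str) -> str:
    """Shortest string whose Goedel number is x (x >= 1)."""
    b = len(alphabet)
    symbols = []
    while x > 0:
        x, digit = divmod(x, b)
        symbols.append(alphabet[digit])
    return "".join(symbols)


def mutate(w: str, alphabet: str, r_c: Callable[[int], int]) -> str:
    """mu(w) = decode(r_c(encode(w))); r_c must not have x as a fixed point."""
    x = encode(w, alphabet)
    y = r_c(x)
    if y == x or y < 1:
        raise ValueError("r_c must map x to a different positive integer")
    return decode(y, alphabet)


# Hypothetical perturbation: add one to the Goedel number.
print(mutate("bc", "abc", lambda x: x + 1))
```

Here "bc" has code 7, the perturbation gives 8, and 8 decodes to "cc", so a small numeric change yields a different string over the same alphabet.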

Definition 2.8. Mutation of A System. 

Let the mutation of the rule table $R$ of a system $S$, encoded as a string, be defined as:-

$$R_\mu = \mu(R)$$

A system $S_\mu$ having $R(S_\mu) = R_\mu$ as its rule table is called a mutated version of the
system $S$.

The notion of mutation of a system (definition 2.8) explains how new
rules can get created automatically. However, the old rules should not get deleted.
These principles are explained in the axioms which follow next.

Axiom 2.1. Natural Learning Systems Axioms. 

(1) Learning is as defined in definition 2.3.

(2) The only allowed operations on rule tables are Clone and Mutation.

We now show that a model exists which is capable of autonomous learning as in
definition 2.3. It adheres to the axioms of (2.1) and learns without any help from
any supervisor. Simply speaking, we show that a designoid [17] can learn.

Theorem 2.1. There is a learning process following Axiom (2.1).

Let $S(t_1)$ be a system at $t_1$ with a non deterministic Universal Turing Machine.
Let, at a later point of time $t_2$, $S$ clone itself (definition 2.4) and then mutate (definition
2.7) the clone to generate a mutated system (definition 2.8). At $t_2$ therefore
there are two systems:-

$$S(t_2) = \{S_c, S_\mu\}$$

The non deterministic Universal Turing Machine of $S$ can non deterministically
use rule table $R_c$ or $R_\mu$. Given that $\mathcal{L}_N = \mathcal{L}(S_\mu) \setminus \mathcal{L}(S) \neq \emptyset$, this system has
learned the language class $\mathcal{L}_N$ within the interval $(t_1, t_2)$.

Proof of the theorem 2.1. We have $\mathcal{L}_N \neq \emptyset$. Given a string $w_s \in \mathcal{L}(S)$ with
$w_s \notin \mathcal{L}(S_\mu)$, by construction this string would be accepted by the new system
when the rule table $R_c$ is used non deterministically.

In the same way a string $w_N \in \mathcal{L}_N$, for which $w_N \notin \mathcal{L}(S)$, would be accepted by
the new system when the rule table $R_\mu$ is used.

Therefore, the new system would continue accepting any string that was earlier
acceptable, and would also accept new strings which it did not accept earlier.
Therefore, the new system qualifies as a system which has learned. □
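The construction in the proof can be mimicked with a small Python sketch. It makes two simplifying assumptions that are not in the paper: rule tables are modelled as total predicates (so no simulation can loop), and the non deterministic choice between tables is modelled by accepting when any table accepts. The class `System` and the helper `shift_mutation` are hypothetical names.

```python
from typing import Callable, List

# Simplifying assumption: a rule table is a total predicate on strings.
RuleTable = Callable[[str], bool]


class System:
    def __init__(self, tables: List[RuleTable]):
        self.tables = list(tables)

    def accepts(self, w: str) -> bool:
        # Non deterministic choice of a table, modelled as "some table accepts".
        return any(table(w) for table in self.tables)

    def clone_and_mutate(self, index: int,
                         mutation: Callable[[RuleTable], RuleTable]) -> "System":
        clone = self.tables[index]             # clone: identical rule table
        return System(self.tables + [mutation(clone)])


def even_length(w: str) -> bool:
    return len(w) % 2 == 0


def shift_mutation(table: RuleTable) -> RuleTable:
    """Hypothetical mutation: accept w when the original accepts w minus its first symbol."""
    return lambda w: len(w) > 0 and table(w[1:])


s_t1 = System([even_length])
s_t2 = s_t1.clone_and_mutate(0, shift_mutation)
print(s_t1.accepts("a"), s_t2.accepts("a"), s_t2.accepts("ab"))
```

The system at $t_2$ keeps the original table untouched, so every previously accepted string is still accepted, while the mutated table contributes the newly accepted strings.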

3. Different Models Of Learner 

We have established that there exists a system, depicted in Theorem 2.1, which
is capable of learning by mutation and cloning. The systems which are capable of
learning will subsequently be called learners.

It is easy to notice that, by Theorem 2.1, the way a learner really learns is by
adding a new rule table to its repository of existing rule tables. For example, if the
original rule system was $R_1$, then the learned system has two rule tables $R_1, \mu(R_1)$,
which can be written as $\{R_1, R_2\}$. In the next level of learning, the rule tables would
become:-

$$\{R_1, R_2, \mu\{R_1, R_2\}\} = \{R_1, R_2, R_3\}$$

So, at every step of learning, formally one new rule table gets added, and the
learner must switch from one table to another to accept a string.
Note that no internal shuffling of the rules is allowed; the rule tables are atomic
building blocks by axiom (2.1). With this idea in mind we can now formally define
a learner as follows:-

Definition 3.1. A Learning System: Learner.

A learner is a system $S$ comprising a set of rule tables $\mathbb{L}(S) = \{R_k\}$ and a
Universal Turing Machine. By simulating a rule table $R_k$ it can accept strings
$w_k \in \mathcal{L}_k \subseteq \mathcal{L}(R_k)$. It is capable of adding a new rule table to $\mathbb{L}(S)$ via a process
called learning. To learn, the system can add a mutated copy of any of the $R_k$ to
$\mathbb{L}(S)$:-

$$\mathbb{L}(S)_{t_2} = \mathbb{L}(S)_{t_1} \cup \{\mu(R_k)\}$$

The language class accepted by the learner is:-

$$\mathcal{L}(S) \subseteq \bigcup_k \mathcal{L}(R_k)$$

Definition 3.2. Complete Learner.

Let us assume a learner $S$ has a set of rule tables $\mathbb{L}(S) = \{R_k\}$. Let the
individual language class for each $R_k$ be $\mathcal{L}(R_k)$. Let the language class accepted by
the learner be $\mathcal{L}_L$. The learner $S$ is said to be complete iff:-

$$\mathcal{L}_L = \bigcup_k \mathcal{L}(R_k)$$



As for the mechanism of implementation, a learner can belong to one of two general
classes: the tandem or sequential class, and the parallel class.

Definition 3.3. Sequential Learner. 

To accept a string, a sequential learner sequentially picks one rule table $R_k$ from
$\mathbb{L}(S)$ and simulates it using the Universal Turing Machine, until either no unused
rule table exists, or there is an accept.

Which rule table is to be used after which now becomes important.
There is no obvious answer to that. In fact, the inherent sequential
nature limits the language class the sequential system can accept. This is the
subject matter of the next theorem.

Theorem 3.1. Language Class Accepted by the Sequential Learner. 

The sequential learner accepts a language class $\mathcal{L}_s$ which is a strict subset of the
union of the language classes of all the rule tables $R_k$s.

$$\mathcal{L}_s \subset \bigcup_k \mathcal{L}(R_k)$$

Proof. We construct the exact language class accepted by the sequential learner.
Let $\mathcal{L}_u(k)$ be the set of strings that are accepted by rule table $R_k$ and
are either accepted or rejected by every $R_j \in \mathbb{L}(S) \setminus \{R_k\}$; that is, every string from the
class $\mathcal{L}_u(k)$ makes the underlying Universal Turing Machine halt on all the other
rule tables, and never gets into an infinite loop.

Then, the language class accepted by the sequential learner is precisely:-

$$\mathcal{L}_s = \bigcup_k \mathcal{L}_u(k)$$

It is obvious that $\mathcal{L}_u(k) \subseteq \mathcal{L}(R_k)$, and therefore, in general, when there exists
at least one string in the language of rule table $R_k$ which makes the Universal Turing
Machine simulating rule table $R_j$ get into an infinite loop, we would have:-

$$\mathcal{L}_s \subset \bigcup_k \mathcal{L}(R_k)$$

That completes the proof. □

It can be argued, however, that by cleverly ordering the selection of the rule tables
one can possibly complete (definition 3.2) the learner. But in general that will be
impossible. The next theorem establishes this fact.
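The blocking behaviour behind Theorem 3.1 can be made concrete with a Python sketch in which each simulation is a generator that yields once per step and returns its verdict. Everything here is an illustrative assumption: the rule tables `loops_on_b` and `ends_with_b` are toy examples, and the step budget `max_steps` exists only so that the demonstration terminates, since a real learner has no way of detecting an infinite loop.

```python
from typing import Callable, Generator, List, Optional

# A simulation yields once per step and finally returns True (accept) or False (reject).
Simulation = Generator[None, None, bool]
RuleTable = Callable[[str], Simulation]


def run(sim: Simulation, max_steps: int) -> Optional[bool]:
    """Run one simulation; None means it was still running after max_steps."""
    for _ in range(max_steps):
        try:
            next(sim)
        except StopIteration as stop:
            return bool(stop.value)
    return None


def sequential_accepts(tables: List[RuleTable], w: str,
                       max_steps: int = 1000) -> Optional[bool]:
    """Try the rule tables one after another, as in Definition 3.3."""
    for table in tables:
        verdict = run(table(w), max_steps)
        if verdict is None:
            return None          # stuck: the remaining tables are never tried
        if verdict:
            return True
    return False


def loops_on_b(w: str) -> Simulation:
    """Toy table: accepts strings of a's, never halts if w contains 'b'."""
    while "b" in w:
        yield
    yield
    return set(w) <= {"a"}


def ends_with_b(w: str) -> Simulation:
    """Toy table: accepts strings ending in 'b'."""
    yield
    return w.endswith("b")


print(sequential_accepts([loops_on_b, ends_with_b], "ab", max_steps=50))
```

The string "ab" belongs to the language of `ends_with_b`, yet with this ordering the learner never reaches that table. Reversing the order rescues this particular string, but choosing a safe order in general requires knowing which inputs hang which tables.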

Theorem 3.2. Sequential learner can not be completed. 

Algorithmic modification of a sequential learner into completeness is impossible.

Proof. We demonstrate using the worst case scenario, which is $\forall R_k\ \exists w_k$ such that
simulating $R_j$ with input $w_k$ gets the Universal Turing Machine into an infinite loop.

Obviously, it is impossible to reduce such a system into a complete system. Now,
assume that there is only one $w_k \in \mathcal{L}(R_k)$ such that simulating $R_j$ with input $w_k$ gets
the Universal Turing Machine into an infinite loop.

But that is unknown unless we are looking at the system from outside of the
system. We must then already have a table which shows which strings from which
rule table class produce an infinite loop in which rule table.

The question is: can we create such a table algorithmically? The answer is no,
because we would never know which string would get the simulation into an infinite
loop; deciding this is an instance of the halting problem. Therefore, automatic
creation of such a table is not possible.

Hence, automatic completion of the sequential system is not possible either, as
was stated. □



Therefore, it is now established by Theorem 3.1 that a sequential learner
does not have closure over the language classes for which it possesses the rule tables.
Also, it is not possible to automatically complete it, as stated by
Theorem 3.2.

Although the rules in a rule table let it precisely accept the language class
$\mathcal{L}_x$, the learner might still not accept all the strings from that class. This is obviously not
efficient.

The next learner does not have that inefficiency, but it achieves that goal with
extra processing units and space. The next learner, of course, is the parallel learner.
It utilizes the concept of Turing Machines running in parallel [8], each having a dedicated
tape, which communicate with a monitoring Turing Machine using a shared
tape.

Definition 3.4. Parallel Learner. 

Let $n = |\mathbb{L}(S)|$. Then the parallel learner has $n + 1$ Universal Turing Machines;
$n$ of them have a single dedicated tape each, and there is a shared tape which every machine shares.

To accept an input string $w \in \Sigma^+$, $w$ is supplied to the shared tape, from
where all the Turing Machines copy the string onto their own dedicated tapes and
start running the simulation. If one of them, $\mathcal{TM}_a$, accepts the string, $\mathcal{TM}_a$
writes back a symbol $\checkmark \notin \Sigma$ to the shared tape.

The last Turing Machine only monitors the shared tape for the symbol $\checkmark$. If it
has found the symbol, it accepts the string, as the system has accepted it.

Theorem 3.3. Language Class Accepted by the Parallel Learner.

The parallel learner accepts a language class $\mathcal{L}_p$ which is exactly the union of the
language classes of all the rule tables $R_k$s.

$$\mathcal{L}_p = \bigcup_k \mathcal{L}(R_k)$$

Proof. This is elementary. The problem of a simulated system getting into an
infinite loop is solved by the $n$ Turing Machines running in parallel without affecting each
other's progress. Also, the last Turing Machine monitoring the output of all the
simulations ensures that if any Turing Machine halts with accept, the learner halts.
Therefore, the parallel system is complete over the rules. □
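The parallel learner of Definition 3.4 can be imitated in Python by advancing every simulation one step per round, which stands in for $n$ machines running concurrently; the early return plays the role of the monitoring machine seeing the accept mark. This is a sketch under stated assumptions: the toy tables `loops_on_b` and `ends_with_b` and the round budget `max_rounds` are hypothetical, the latter being there only to keep the demonstration finite.

```python
from typing import Callable, Generator, List, Optional

# A simulation yields once per step and finally returns True (accept) or False (reject).
Simulation = Generator[None, None, bool]
RuleTable = Callable[[str], Simulation]


def parallel_accepts(tables: List[RuleTable], w: str,
                     max_rounds: int = 1000) -> Optional[bool]:
    """Advance all simulations in lockstep; accept as soon as any one accepts."""
    running = [table(w) for table in tables]
    for _ in range(max_rounds):
        if not running:
            return False                 # every machine halted with reject
        still_running = []
        for sim in running:
            try:
                next(sim)
                still_running.append(sim)
            except StopIteration as stop:
                if stop.value:
                    return True          # the monitor sees the accept mark
        running = still_running
    return None                          # no verdict within the round budget


def loops_on_b(w: str) -> Simulation:
    """Toy table: accepts strings of a's, never halts if w contains 'b'."""
    while "b" in w:
        yield
    yield
    return set(w) <= {"a"}


def ends_with_b(w: str) -> Simulation:
    """Toy table: accepts strings ending in 'b'."""
    yield
    return w.endswith("b")


print(parallel_accepts([loops_on_b, ends_with_b], "ab", max_rounds=50))
```

The looping simulation of `loops_on_b` on "ab" no longer blocks `ends_with_b`, so the string is accepted regardless of the order of the tables.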

4. Comparisons of Learning Strategies 

In the previous section, we have established two learning strategies: one in
tandem, the sequential strategy (definition 3.3), and another, the parallel strategy
(definition 3.4). In the current section we establish the pros and cons of the
different strategies.

Note that we are discussing a blind design, that is, no conscious optimization,
ever. No one is looking at the system from the outside and improving the design or
the wiring of the system.

Theorem 4.1. Completeness of the Learning Models. 

Parallel Learners are complete (definition 3.2) while Sequential Learners are not.

Proof. The proof is a direct consequence of Theorem 3.3 and Theorem 3.1. □

By this theorem, one clear thing we have established is that the parallel system
is more optimal than the sequential system, in general. That is, a parallel system
can use all its resources to gain maximum coverage of the strings that are to
be accepted, while a sequential learner can not. Also, by Theorem 3.2 we have
established that there is no autonomous way to complete the sequential system.
So, parallel systems have an inherent evolutionary advantage.

But this is not the only metric by which the learners are to be measured for
optimality. In computer science, time and space complexity are of utmost importance.

Formally, the time complexity question is: how much time does it take to accept a
string?

It is not too hard to show that the parallel strategy wins here. We show it in
the next theorem:-

Theorem 4.2. Time Complexity Comparison of the Learners.

If the time taken to accept a string $w$ in the sequential learner is $t_s(w)$ and in the parallel
learner is $t_p(w)$, then

$$t_p(w) \leq t_s(w) \quad \forall w$$

given both use the same rule tables and the same Universal Turing Machines.

Proof. This is pretty elementary. As the sequential learner acts in tandem, the time
taken to accept a string is precisely the time taken to reject the string by all the
previous simulations plus the time taken to accept it. That gives:-

$$t_s(w) = \sum_{j=1}^{k-1} t_r(w, j) + t_a(w, k)$$

where the string gets accepted using the $k$th rule table, $t_r(w, j)$ is the time
taken to reject the string by the $j$th rule table, and $t_a(w, k)$ is the time taken to accept
the string by simulating the $k$th rule table.
However, for the parallel case:-

$$t_p(w) = t_a(w, k)$$

Therefore,

$$t_s(w) - t_p(w) = \sum_{j=1}^{k-1} t_r(w, j) \geq 0$$

which clearly establishes the theorem. □
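The two formulas in the proof can be checked numerically with a short Python sketch. The step counts below are made-up illustrative numbers, and the accepting table is addressed by a 0-based index `k`, unlike the 1-based $k$ of the theorem.

```python
from typing import List


def sequential_time(times: List[int], k: int) -> int:
    """Reject times of the tables tried before k, plus the accept time of table k."""
    return sum(times[:k]) + times[k]


def parallel_time(times: List[int], k: int) -> int:
    """Only the accept time of table k matters when all tables run at once."""
    return times[k]


# Hypothetical per-table running times on one input; table index 2 accepts.
times = [5, 7, 3]
print(sequential_time(times, 2), parallel_time(times, 2))
```

With these numbers the sequential learner needs 15 steps and the parallel learner 3; the two coincide only when the accepting table happens to be tried first.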

We should also ask about the storage complexity of the simulations. That is, to
set up the sequential learner and the parallel learner, how much storage space is
needed?

We note that the storage space $\sigma$ in either case is a function of the number of
rule tables $n$, because one needs to incorporate each newly added rule table. Let
the storage required to store the rule table $R_k$ be $\sigma_r(k)$, and that of a Universal Turing
Machine be $\sigma_t$; we then have the precise relation as the next theorem:-

Theorem 4.3. Storage Space Comparison of the Learners.

If the storage space of the sequential learner is $\sigma_s$ and that of the parallel learner is $\sigma_p$, then

$$\sigma_s = \sigma_p$$

given both use the same rule tables and the same Universal Turing Machines.

Proof. We note that:-

$$\sigma_s = \sigma_t + \sum_{k=1}^{n} \sigma_r(k)$$

The parallel machine needs to have $n + 1$ Turing Machines, but the rule table of every
Universal Turing Machine is the same, and that is never going to change.
Then only one copy of the rule table for the Universal Turing Machine itself suffices.
Then

$$\sigma_p = \sigma_t + \sum_{k=1}^{n} \sigma_r(k)$$

which implies that $\sigma_s = \sigma_p$. □

Now we ask about the space complexity of the simulations done by both the sequential
and the parallel learner.

Theorem 4.4. Space Complexity Comparison of the Learners.

If the space used to accept a string $w$ in the sequential learner is $S_s(w)$ and in the parallel
learner is $S_p(w)$, then

$$S_s(w) \leq S_p(w) \quad \forall w$$

given both use the same rule tables and the same Universal Turing Machines.

Proof. The proof is again elementary; we establish the precise relation between
them. Assume that the space required to reject the string by the $j$th simulation is
$S_r(w, j)$ and to accept it by the $k$th simulation is $S_a(w, k)$. Then,

$$S_s(w) = \sup\{S_r(w, 1), \dots, S_r(w, k-1), S_a(w, k)\}$$

But,

$$S_p(w) = S_a(w, k) + \sum_{j=1,\, j \neq k}^{n} S_r(w, j)$$

which immediately establishes the theorem. □
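The space relations of the proof can be illustrated in the same way. The sketch below assumes, as the proof does, that the sequential learner reuses one tape across simulations (hence the maximum), while the parallel learner keeps all $n$ dedicated tapes alive at once (hence the sum). The numbers are invented for illustration and `k` is a 0-based index.

```python
from typing import List


def sequential_space(spaces: List[int], k: int) -> int:
    """Largest tape usage among the tables tried up to and including table k."""
    return max(spaces[: k + 1])


def parallel_space(spaces: List[int]) -> int:
    """All n dedicated tapes are in use simultaneously."""
    return sum(spaces)


# Hypothetical tape usage of each table on one input; table index 2 accepts.
spaces = [4, 9, 2]
print(sequential_space(spaces, 2), parallel_space(spaces))
```

Here the sequential learner peaks at 9 cells while the parallel learner occupies 15, which is the price paid for the time advantage of Theorem 4.2.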

Theorem 4.5. Learning Complexity Comparison of the Learners.

Let the expected time taken to learn a new language class $\mathcal{L}_N$ by a sequential
learner be $\tau_s(\mathcal{L}_N)$ and that of a parallel learner be $\tau_p(\mathcal{L}_N)$. Then,

$$\tau_p(\mathcal{L}_N) \leq \tau_s(\mathcal{L}_N)$$

Proof. Suppose the language class $\mathcal{L}_N$ can actually be incorporated, that is, $\mathcal{L}_N$ has
no string for which the sequential learner would go into an infinite loop. Then
learning requires cloning and a lucky mutation for both types of system, so
the time taken is the same as the expected time for the lucky mutation,
$\tau_\mu$. In this case:-

$$\tau_p(\mathcal{L}_N) = \tau_s(\mathcal{L}_N) = \tau_\mu$$

But when some $w \in \mathcal{L}_N$ makes the sequential system get into an infinite loop, then
$\tau_s(\mathcal{L}_N) \to \infty$, and generally:-

$$\tau_p(\mathcal{L}_N) \leq \tau_s(\mathcal{L}_N)$$

holds, as stated. □

These theorems clearly establish that when space is not at a premium, the
parallel learner is the more optimal strategy for learning. Specifically, if accuracy
and completeness are needed, then parallel learners are better than the sequential
ones.

But that is not all. The crux of this lies in the fact that autonomous learning
in nature has to be inherently blind, triggered by chance mutations. Given that,
there would be no way to ensure an out of the system decision making to further
optimize the design of the resulting system.

Due to the nature of blind learning, even if optimisation is possible, there
has to be parallelism to complete the learner. Such an intelligent ordering can evolve
in nature, and is discussed in the next theorem.

Theorem 4.6. Existence of a Hybrid Learning System 

There exists a hybrid model of learning which uses both the parallel and the serial model,
and is complete.

Proof. We prove it by constructing a hybrid system. We note that the halting
problem establishes a partial order relation over the rule tables. It can be stated
as:-

$$R_i < R_j \text{ iff some } w_j \in \mathcal{L}(R_j) \text{ hangs the simulation of } R_i$$

Given this partial order relation we can create sets $U_k$ (for unrelated) so that:-

$$R_a \in U_k \text{ iff } \nexists R_b \in U_k \text{ s.t. } R_a < R_b \text{ or } R_b < R_a$$

Let $U = \{U_k\}$. Now the individual rules $R_{k_i} \in U_k$ can be run sequentially. But cross
set rules $R_{k_i} \in U_k$ and $R_{p_i} \in U_p$ can not be run sequentially.

So, if a system is capable of running $n = |U|$ parallel executions, we can make
a complete hybrid system that runs related rules in parallel, while unrelated rules
are run sequentially.

This completes the proof. □
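The grouping step of the construction can be sketched in Python, under the strong assumption that the relation "some string of table $j$ hangs table $i$" is simply handed to us as a set of index pairs. In reality this relation is not computable, so the sketch only illustrates what the sets $U_k$ look like once the relation is known. The greedy assignment and the name `group_unrelated` are choices made for this example, not the paper's method.

```python
from typing import List, Set, Tuple


def group_unrelated(n: int, hangs: Set[Tuple[int, int]]) -> List[List[int]]:
    """Greedily partition tables 0..n-1 into groups of pairwise unrelated tables.

    hangs contains (i, j) when some string accepted by table j hangs table i.
    Tables inside one group can be tried sequentially; distinct groups need
    separate parallel units.
    """
    def related(a: int, b: int) -> bool:
        return (a, b) in hangs or (b, a) in hangs

    groups: List[List[int]] = []
    for table in range(n):
        for group in groups:
            if not any(related(table, other) for other in group):
                group.append(table)
                break
        else:
            groups.append([table])       # no compatible group: open a new one
    return groups


# Hypothetical relation: table 1 hangs table 0, and table 3 hangs table 2.
print(group_unrelated(4, {(0, 1), (2, 3)}))
```

For this relation two groups result, so two parallel units suffice instead of four, with the tables inside each group tried one after another.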

We note that the construction of such a system is not algorithmic, that is, not
computationally possible. The step of finding out the relations between the
rule tables is not computable. Hence, only blind mutation can create such a system,
that is, by chance. However, up until this section we have established that
parallelism, in general, is inherent in nature; even if a hybrid system has evolved, there is
parallelism inherent in it. In the next section we show why the parallel learning
strategy dominates nature.

5. Parallel Strategies and Evolution 

In this section we finally ask the following question : 

Given the sequential and parallel strategies available, which strategy would dom- 
inate in nature? 

This obviously should depend upon which strategy pays off more
for survival. Payoffs like this are in the realm of Game Theory [21]. In the
context of Game Theory, payoffs are represented as numbers which represent
the motivations of players. Payoffs may represent profit, quantity, or any other
"utility".

As resources are limited in nature, gaining or losing advantage can
be modelled by a fixed sum game [21], where both the players are competing for
a fixed sum in reward. However, we can generalise any fixed sum game into a zero
sum game by setting the fixed amount at 0 [21][17][18].

Given that accepting any string has the same utility, we can take the payoffs to be
proportional to the cardinality of the language class accepted by the strategy. In
general, if learner $p_1$ plays with learning strategy $S_1$, and learner $p_2$ plays with
learning strategy $S_2$, then, intuitively, the player $p_1$ is at an advantage if $|\mathcal{L}(S_1)| >
|\mathcal{L}(S_2)|$, and for $p_2$ it is vice versa, with $|X|$ denoting the cardinality of the set $X$.
Hence, the payoff of $S_1$ against $S_2$, denoted by $E(S_1, S_2)$, can be calculated as:-

$$E(S_1, S_2) = |\mathcal{L}(S_1)| - |\mathcal{L}(S_2)|$$

When $\mathcal{L}(S)$ becomes infinite, to define the payoff formally we have to resort to
measure theory [19].

Definition 5.1. Utility Measure. 

Let $A, B, X, Y \subseteq \Sigma^+$ be sets of strings. A measure $U : \Sigma^+ \to \mathbb{R}_+$ is a utility
measure iff:-

$$U(X) = 0 \text{ iff } X = \emptyset$$

implying:-

$$A \subset B \implies U(B) > U(A)$$

where $U(X) > U(Y)$ implies that the set $X$ has more utility than $Y$.

Note that with this measure in place, we do not need any assumptions about
the utilities of different strings. It also echoes the age old saying "no learning is
useless", only this time more formally.

Definition 5.2. Payoff between Learning Strategies. 

Let $S_1, S_2 \in \mathbb{S}$ be two learning strategies. Let $\mathcal{L}(S_i)$ denote the language class
accepted by the strategy $S_i$. Let $\bigcup_k \mathcal{L}(S_k) = \mathcal{L}(\mathbb{S})$. Let $U$ be a utility measure (5.1)
defined over $\mathcal{L}(\mathbb{S})$. Then, the payoff of strategy $S_1$ against $S_2$ is given by:-

$$E(S_1, S_2) = U(\mathcal{L}(S_1)) - U(\mathcal{L}(S_2))$$

In particular, if $S_1 = S_2 = S$ then,

$$E(S, S) = 0$$
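For finite language classes the payoff can be computed directly. The Python sketch below uses the counting measure as the utility, which is one possible choice satisfying Definition 5.1 on finite sets (it is zero only for the empty set and strictly increases under strict inclusion). The example languages are invented for illustration.

```python
from typing import FrozenSet


def utility(strings: FrozenSet[str]) -> int:
    """Counting measure: one possible utility measure on finite sets of strings."""
    return len(strings)


def payoff(lang_1: FrozenSet[str], lang_2: FrozenSet[str]) -> int:
    """E(S1, S2) = U(L(S1)) - U(L(S2))."""
    return utility(lang_1) - utility(lang_2)


# Hypothetical accepted languages of two strategies.
seq_lang = frozenset({"a", "aa"})
par_lang = frozenset({"a", "aa", "ab"})

print(payoff(par_lang, seq_lang), payoff(seq_lang, par_lang), payoff(par_lang, par_lang))
```

The payoff is antisymmetric by construction, which is what makes the game zero sum.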

Now we make the bold claim that the parallel strategy is evolutionarily stable. To do
so we need to state the definition of an evolutionarily stable strategy [16], [17], [21].



Definition 5.3. Evolutionarily stable strategy (ESS). Let $\mathbb{S}$ be a set of
strategies. Let $E(S, T)$ represent the payoff for playing strategy $S$ against strategy $T$. The
strategy $S$ is an ESS iff one of the following conditions holds $\forall T \neq S$ with $S, T \in \mathbb{S}$:-

(1) Strict Nash Equilibrium:

$$\forall T \in \mathbb{S} ;\ E(S, S) > E(T, S)$$

(2) Maynard Smith's second condition:

$$\forall T \in \mathbb{S} ;\ E(S, S) = E(T, S) \text{ and } E(S, T) > E(T, T)$$

Now with this definition in hand, we establish the criterion for ESS in the
evolution of learning.
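Definition 5.3 can be turned into a small checker. The Python sketch below tests, for each competing strategy $T$, whether the strict Nash condition or Maynard Smith's second condition holds, which is the usual per-competitor reading of the definition. The strategy names, the finite languages and the use of set cardinality as the utility are assumptions made for this example.

```python
from typing import Callable, Dict, FrozenSet, Iterable


def is_ess(s: str, strategies: Iterable[str],
           E: Callable[[str, str], float]) -> bool:
    """Check the two ESS conditions of Definition 5.3 against every competitor."""
    for t in strategies:
        if t == s:
            continue
        if E(s, s) > E(t, s):
            continue                       # (1) strict Nash equilibrium
        if E(s, s) == E(t, s) and E(s, t) > E(t, t):
            continue                       # (2) Maynard Smith's second condition
        return False
    return True


# Hypothetical accepted languages; the payoff is the difference of cardinalities.
languages: Dict[str, FrozenSet[str]] = {
    "sequential": frozenset({"a", "aa"}),
    "parallel": frozenset({"a", "aa", "ab"}),
}


def E(s: str, t: str) -> int:
    return len(languages[s]) - len(languages[t])


print(is_ess("parallel", languages, E), is_ess("sequential", languages, E))
```

In this toy population the strategy with the strictly larger language passes the strict Nash test, while the smaller one fails both conditions.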

Theorem 5.1. ESS for Learning Strategies. 

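Let $\mathbb{S}$ be a set of learning strategies. Let $S_e \in \mathbb{S}$ be a learning strategy with:-

$$\forall S_k \in \mathbb{S},\ S_k \neq S_e ;\ \mathcal{L}(S_k) \subset \mathcal{L}(S_e)$$

where $\mathcal{L}(S)$ is the language class accepted by strategy $S$. Then, the strategy $S_e$ is an ESS.

Proof. Writing $\mathcal{L}(S_e) = \mathcal{L}(S_k) \cup \mathcal{L}_e$, we note that:-

$$E(S_e, S_k) = U(\mathcal{L}(S_k) \cup \mathcal{L}_e) - U(\mathcal{L}(S_k))$$

where $\mathcal{L}(S_k) \cap \mathcal{L}_e = \emptyset$, which implies:-

$$\forall S_k \in \mathbb{S} ;\ E(S_e, S_k) = U(\mathcal{L}_e) \implies E(S_e, S_k) > 0$$

implying:-

$$\forall S_k \in \mathbb{S} ;\ E(S_k, S_e) = -U(\mathcal{L}_e) \implies E(S_k, S_e) < 0$$

By definition we have $E(S_e, S_e) = 0$ and therefore,

$$\forall S_k \in \mathbb{S} ;\ E(S_e, S_e) > E(S_k, S_e)$$

Comparing with definition (5.3), rule (1), $S_e$ is an ESS. In fact we note that this
ESS is a strict Nash Equilibrium. □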

Now we establish that the parallel strategies are ESS.

Theorem 5.2. Parallel learning strategies are ESS.

Let $\mathbb{S}$ be a set of strategies such that every $S \in \mathbb{S}$ has the rule table set $R(S) = \mathcal{R}$. Let
$S_p \in \mathbb{S}$ be a parallel strategy. Then $S_p$ is an ESS.

Proof. We note that the language class accepted by the tandem learners is a
strict subset of the language class accepted by the parallel learner, by Theorems
3.1, 3.3 and 4.1. That is, if $\mathcal{L}_t$ is the language class of the sequential learners, and $\mathcal{L}_p$ is
the language class of the parallel learners,

$$\mathcal{L}_t \subset \mathcal{L}_p$$

Now using Theorem 5.1 we can immediately deduce that parallel strategies are
ESS. □

6. Summary and Future Works 

It is really surprising that the parallel execution model, which is notorious for
creating issues in everyday computing, should in fact be prevalent in nature. In
this paper we have demonstrated why parallel strategies are the optimally
suited ones in nature. Once nature gets invaded by the parallel strategies, there is no going
back, because parallel strategies are ESS. We established this fact using the Church
Turing Thesis and a utility measure which tallies with common sense. This
demonstrates the power of computer science as a proper science, fully capable of
describing natural phenomena outside the realm of rather artificial settings.

Although we have established why nature prefers parallel execution, we did not
establish a typical learning time for any language class, or the expected time to
evolve into a hybrid learning system. These studies will be done in the future.